A general compression algorithm that supports fast searching
Authors
Abstract
The task of compressed pattern matching [2] is to report all occurrences of a given pattern P in a text T that is available in compressed form. Certain compression algorithms allow searching without prior decoding, which can be practical, especially if the search is faster than on the uncompressed representation. Most known schemes, however, either assume that the text is segmented into words or are complex and largely theoretical.
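For orientation, the task can be stated as a decode-then-search baseline, which is the approach that searching without prior decoding tries to beat. This is only a minimal sketch: zlib is an arbitrary stand-in compressor, and the function names `find_all` and `search_by_decoding` are hypothetical. It is not the scheme proposed in the paper.

```python
import zlib


def find_all(pattern: bytes, text: bytes) -> list[int]:
    """Return the start offsets of all (possibly overlapping) occurrences."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Step by one so that overlapping occurrences are reported too.
        start = text.find(pattern, start + 1)
    return positions


def search_by_decoding(pattern: bytes, compressed: bytes) -> list[int]:
    """Baseline: fully decode the text first, then search the plain bytes.

    zlib is an assumed stand-in for "some compressor"; a compressed
    matching scheme would avoid the decompress call entirely.
    """
    return find_all(pattern, zlib.decompress(compressed))


if __name__ == "__main__":
    T = b"abracadabra abracadabra"
    compressed_T = zlib.compress(T)
    print(search_by_decoding(b"abra", compressed_T))
```

The baseline always pays the full decoding cost before any matching starts, which is the overhead the abstract refers to.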
Similar resources
Fast Relative Lempel-Ziv Self-index for Similar Sequences
Recent advances in biotechnology and web technology are generating huge collections of similar strings. People now face the problem of storing them compactly while supporting fast pattern searching. One compression scheme called relative Lempel-Ziv compression uses textual substitutions from a reference text as follows: Given a (large) set S of strings, represent each string in S as a concatena...
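The substitution idea behind relative Lempel-Ziv compression can be sketched as a greedy parse of a string into phrases copied from the reference text. This is a hedged illustration only: the function names, the `(position, length)` phrase format, the literal fallback for characters missing from the reference, and the naive quadratic search are all assumptions made here, and the listed paper's self-index is not reproduced.

```python
def rlz_factorize(reference: str, s: str) -> list[tuple]:
    """Greedily parse s into phrases that are substrings of reference.

    Each phrase is (position, length) into reference. A character that
    does not occur in reference is emitted as the literal (char, 0);
    this fallback is an assumption of the sketch.
    """
    phrases = []
    i = 0
    while i < len(s):
        best_pos, best_len = -1, 0
        length = 1
        # Extend the candidate phrase while it still occurs in reference.
        while i + length <= len(s):
            pos = reference.find(s[i:i + length])
            if pos < 0:
                break
            best_pos, best_len = pos, length
            length += 1
        if best_len == 0:
            phrases.append((s[i], 0))
            i += 1
        else:
            phrases.append((best_pos, best_len))
            i += best_len
    return phrases


def rlz_decode(reference: str, phrases: list[tuple]) -> str:
    """Rebuild the original string from its phrases."""
    parts = []
    for first, length in phrases:
        parts.append(first if length == 0 else reference[first:first + length])
    return "".join(parts)


if __name__ == "__main__":
    ref = "ACGTACGT"
    phrases = rlz_factorize(ref, "ACGTTACG")
    print(phrases)
    print(rlz_decode(ref, phrases))
```

A practical implementation would replace the repeated `find` calls with an index over the reference (for example a suffix array), since the naive search shown here is quadratic in the worst case.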
Faster Fractal Image Coding Using Similarity Search in a KL-transformed Feature Space
Fractal coding is an efficient method of image compression but has a major drawback: the very slow compression phase, due to a time-consuming similarity search between image blocks. A general acceleration method based on feature vectors is described, of which we can find many instances in the literature. This general method is then optimized using the well-known Karhunen-Loeve expansion, allowing o...
A General Practical Approach to Pattern Matching over Ziv-Lempel Compressed
We address the problem of string matching on Ziv-Lempel compressed text. The goal is to search a pattern in a text without uncompressing it. This is a highly relevant issue to keep compressed text databases where efficient searching is still possible. We develop a general technique for string matching when the text comes as a sequence of blocks. This abstracts the essential features of Ziv-Lempe...
A General Practical Approach to Pattern Matching over Ziv-Lempel Compressed Text
We address in this paper the problem of string matching on Lempel-Ziv compressed text. The goal is to search a pattern in a text without uncompressing. This is a highly relevant issue, since it is essential to have compressed text databases where efficient searching is still possible. We develop a general technique for string matching when the text comes as a sequence of blocks. This abstracts th...
Searching for Unique DNA Sequences with the Burrows-Wheeler Transform
The objective of this study was to present an efficient algorithm that effectively aids the problem of searching for unique DNA sequences in the set of genes. The presented algorithm is based on the Burrows-Wheeler Transform (BWT), a very fast and effective data compression algorithm. The developed algorithm exploits all the advantages offered by the BWT algorithm and the suffix array data stru...
Journal title:
- Inf. Process. Lett.
Volume 100, Issue
Pages -
Publication date 2006